Abstract

When a person must choose among options without knowing which one is correct, he or she shows
a strong tendency to follow the majority. If each option carries a multiplier m and the return for
a correct choice is set to m, which option does one choose? Game theory predicts that the max-min
strategy, in which one divides one's choice among the options in inverse proportion to m, is
optimal. We test this prediction in a voting experiment in which 50 to 60 subjects answer a
two-choice quiz sequentially, with and without information about prior subjects' choices. The
information is given to the subjects in two patterns, C and M. In case C, the subjects know how
many previous subjects have chosen each choice, and the payoff for the correct choice is constant.
In case M, each choice has a multiplier m that is inversely proportional to the number of prior
subjects who have chosen it, and the payoff for the correct choice is proportional to the
multiplier. In case C, the probability that a subject who does not know the correct choice selects
a given choice increases rapidly with the proportion of prior subjects who have chosen it. In
case M, this probability is inversely proportional to m for 4/3 < m < 4; in this range, the
subjects collectively adopt the max-min strategy. The threshold value of the information cascade
phase transition is considerably larger than in case C.

I. INTRODUCTION



Even if each person has limited information, aggregated information becomes very accurate.
This is the wisdom of crowds effect, and it is supported by many examples from
political elections, sports predictions, quiz shows, and prediction markets [2]-[4]. Three
conditions need to be satisfied for the aggregate to be accurate: diversity, independence,
and decentralization. If these conditions are not satisfied, aggregated information
becomes unreliable or worse 0, [EJ . However, in an ever-more connected world, it becomes 
more and more difficult to retain the independence. Furthermore, if the actions or choices 
of others are visible, neglecting them is not realistic in light of the merit of social learning 
[6], [7]. In this case, an information cascade may emerge and information aggregation ceases.



More concretely, we consider a situation where people answer a two-choice question with
choices A and B sequentially. Before a person answers, many other people have already
answered, and their choices are made known as $C_A$ people choosing A and $C_B$ people choosing
B; we call this social information. If the person answering knows the correct choice, he
should choose it, and his choice is not affected by social information. We call him an
independent voter. However, if he does not know the correct choice, he will be affected by
social information [15]. He tends to go with the majority, and this is rational behavior. We
call him a herder, because he copies the majority. Rational herding puts the wisdom of
crowds at risk. If herders are isolated from others, their choices split evenly between A and B
and cancel out. As a result, only the choices of the independent voters remain, and the majority
choice always converges to the correct one in the limit of a large number of people. This
is known as Condorcet's jury theorem [1]. However, if others' choices are given as social
information, the cancellation mechanism does not work. The herder copies the majority
and ignores the correct information given by the independent voters. If the proportion of
herders $p$ exceeds some threshold value $p_c$, there occurs a phase transition from the one-peak
phase, where the majority choice always converges to the correct one, to the two-peak phase,
where the majority choice converges to the wrong one with a finite and positive probability
[16]. We call this phase transition the information cascade transition [17], [18]. This is the
risk of imitation in the wisdom of crowds. How can we avoid this risk? There exists a hint in
race-track betting markets and prediction markets [19]. In order to aggregate information
scattered among people, the market mechanism, or an invisible hand, can be very effective.

We consider a situation in which each choice $\alpha \in \{A, B\}$ has a multiplier $M_\alpha$ that is
inversely proportional to the number of subjects $C_\alpha$ who chose it. The payoff for the correct
choice is proportional to the multiplier. If the multiplier of a choice is large, the number
of people who chose it is small. If the return were constant, it would not be rational for a
herder to choose such a choice. Now, however, the return on the correct choice is proportional
to the multiplier, and hence we can no longer say that choosing the minority is irrational.
Copying the majority gives him a small return, even if it is the correct choice. The multiplier
plays the role of a "tax" on herding (free riding), and copying the minority can be an attractive
choice. By a max-min argument based on game theory [21], we can show that the optimized behavior
is the one where a herder chooses $\alpha$ with a probability proportional to $C_\alpha$. We call a herder
who adopts the optimized behavior an analog herder [22]. If herders behave as analog herders, the
information cascade phase transition does not occur. Instead, a phase transition in the
convergence speed occurs when the proportion of analog herders exceeds one half [23]: the
derivative of the convergence rate becomes discontinuous at this point. However,
the majority of people always choose the correct choice in the limit of a large number of
people, and the system is in the one-peak phase for any value of the proportion. An invisible
hand induced by multipliers automatically removes the risk of imitation in the wisdom of
crowds.

In this paper, we adopt an experimental approach to study whether herders
collectively adopt the optimized strategy and behave as analog herders if the choices have
multipliers. We also study the information cascade transition of the system and the
performance of the wisdom of crowds. The organization of the paper is as follows. We explain
the experiment and the optimized strategy in Section II. The subjects answer a two-choice
quiz in three cases $r \in \{O, C, M\}$. In case O, the subjects answer without social information.
In cases C and M, they receive social information based on previous subjects' choices. Social
information is given as summary statistics $\{C_A, C_B\}$ in case C and as multipliers $\{M_A, M_B\}$
in case M. Sections III and IV are devoted to the analysis of the experimental data. In
Section III, we study the macroscopic aspects of the system. As the proportion of herders
approaches 100%, the convergence of the sequence of choices becomes extremely slow and
information aggregation almost ceases in both cases $r \in \{C, M\}$. In Section IV, we derive a
microscopic rule regarding how herders copy others in each case $r \in \{C, M\}$. In Section V, we
introduce a stochastic model that simulates the system. We study the information cascade
phase transition and the performance of herders in the system. Section VI is devoted to
the summary and discussions. In the appendices, we give some supplementary information
about the experiment and the estimation procedure of the parameters in Section IV.



II. EXPERIMENTAL SETUP AND OPTIMIZED STRATEGY IN CASE M



A. Experimental design 

The experiment reported here was conducted at the Group Experiment Laboratory of
the Center for Experimental Research in Social Sciences at Hokkaido University. We
conducted two experiments, which we call EXP-I and EXP-II. In EXP-I (II), we recruited 120
(104) students from the university. We divided them into two groups, Group A and Group B,
and prepared two sequences of subjects of average length 60 (50). The subjects answered a
two-choice quiz of 120 questions sequentially. We label the questions by $i \in \{1, 2, \dots, 120\}$.

In EXP-I, the subject answers in three cases $r \in \{O, C, M\}$ in this order. We denote
the answer to question $i$ in case $r$ after $t-1$ subjects' answers by $X(i,t|r)$, which takes the
value 1 (0) if the choice is true (false). $\{C_1(i,t|r), C_0(i,t|r)\}$ are the numbers of subjects
who choose true and false, respectively, for question $i$ among the prior $t$ subjects:

$$C_1(i,t|r) = \sum_{t'=1}^{t} X(i,t'|r), \qquad C_0(i,t|r) = t - C_1(i,t|r). \tag{1}$$

In case O, the subject answered without any social information. Then, he answered in
case C. Before him, $t-1$ subjects had answered question $i$, and he received the summary
statistics $\{C_A(i,t-1|C), C_B(i,t-1|C)\}$ from them. For the correct choice in cases O and C,
the subject gets two points. Finally, in case M, the subject receives multipliers
$\{M_A(i,t-1), M_B(i,t-1)\}$ computed from all previous $t-1$ subjects. For the correct choice,
the subject gets the number of points given by the multiplier. The multiplier $M_\alpha$ for
$\alpha \in \{A, B\}$ was calculated from the summary statistics in case M as

$$M_\alpha(i,t-1) = \frac{C_A(i,t-1|M) + C_B(i,t-1|M) + 1}{C_\alpha(i,t-1|M) + 1} = \frac{t}{C_\alpha(i,t-1|M) + 1}. \tag{2}$$

The multiplier arises as follows: each subject's choice is worth one point, and the total of
$C_A + C_B + 1 = t$ points is divided equally among the $C_\alpha + 1$ subjects who have chosen
$\alpha$. This is similar to the payoff odds of the parimutuel system in gambling. Figure 1
displays the experimental design.

FIG. 1. Pictorial explanation of the experimental procedure. There are 120 questions in the quiz,
labeled by $i \in \{1, 2, \dots, 120\}$. A subject answers question $i$ in three cases
$r \in \{O, C, M\}$ in this order: case O (no information), then case C, then case M, with answers
$X(i,t|O)$, $X(i,t'|C)$, and $X(i,t''|M)$. If the answer was given after $t-1$ subjects, it is
denoted as $X(i,t|r)$.
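The multiplier rule of Eq. (2) can be illustrated with a minimal Python sketch. The function
name `multipliers` and its argument names are hypothetical and chosen only for this
illustration; the arithmetic follows Eq. (2), with $t = C_A + C_B + 1$.

```python
def multipliers(c_a: int, c_b: int) -> tuple[float, float]:
    """Return (M_A, M_B) following Eq. (2): M_alpha = t / (C_alpha + 1).

    c_a and c_b are the numbers of prior subjects who chose A and B,
    and t = c_a + c_b + 1 counts the current subject as well.
    """
    if c_a < 0 or c_b < 0:
        raise ValueError("counts must be non-negative")
    t = c_a + c_b + 1
    return t / (c_a + 1), t / (c_b + 1)


# Nine prior subjects: one chose A and eight chose B.
m_a, m_b = multipliers(1, 8)
print(round(m_a, 1), round(m_b, 1))  # prints "5.0 1.1"
```

With no prior subjects, both multipliers equal 1, so the first subject in a sequence faces a
constant payoff.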

In EXP-II, in addition to the three cases $r \in \{O, C, M\}$, the subjects answered in at most
four cases $r \in \{1, 5, 11, 21\}$ between cases O and C. In case $r \in \{1, 5, 11, 21\}$, the subject
received summary statistics $\{C_A(i,t-1|r), C_B(i,t-1|r)\}$ from the previous $r$ subjects, so that
$C_A(i,t-1|r) + C_B(i,t-1|r) = r$ holds; as $r$ increases, the amount of social information
increases. In EXP-I, the amount of social information increases abruptly from $r = 0$ in case
O to $r = t-1$ in case C. In EXP-II, $r$ increases gradually. The payoff for the correct
choice is 1 in case $r \in \{O, 1, 5, 11, 21, C\}$ and the multiplier in case M. Detailed information
about EXP-II has been presented in our previous work [16], where we studied the
experimental data for cases $r \in \{O, 1, 5, 11, 21, C\}$. In this paper, we concentrate on cases
$r \in \{O, C, M\}$.

There were two groups (A and B) of subjects, and we repeated the same experiment for each. We
obtained $120 \times 2$ sequences $X(i,t|r)$ for each $r \in \{O, C, M\}$. We label the sequences in group
B by $i + 120$, so that $i \in \{1, 2, \dots, 240\}$. We denote the length of sequence $\{X(i,t|r)\}$ by
$T_i$.

TABLE I. Experimental design. T is the number of subjects, {r} is the set of cases in which
the subjects answered the quiz, and I is the number of questions.

Experiment  Group  T   {r}                      I
EXP-I       A      57  {O, C, M}                120
EXP-I       B      63  {O, C, M}                120
EXP-II      A      52  {O, 1, 5, 11, 21, C, M}  120
EXP-II      B      52  {O, 1, 5, 11, 21, C, M}  120

B. Experimental procedure 



[Figure 2: three screenshots of the "Voting Experiment" screen for the question "Which composer
is famous for the Symphonie No. 6 Pathetique?" with the choices Beethoven and Tchaikovsky.
Top: no information. Middle: all previous subjects' information ("Up to now 9 subjects have
answered. Their choices are as follows. Please choose."). Bottom: payoff odds information, with
the choices of the 9 previous subjects shown as the multipliers x5 and x1.1 and the instruction
that, if the choice is true, the points earned are multiplied by the multiplier.]

FIG. 2. Snapshot of the screen for cases O, C, and M. Summary statistics $\{C_A, C_B\}$ (multipliers
$\{M_A, M_B\}$) are given in the second row in the box in case C (M).

We explain the experimental procedure in EXP-I in detail. There were five sessions
for each group. In one session, 10 to 13 subjects entered the laboratory and sat in
partitioned spaces. After listening to a brief explanation of the experiment and the
reward, in particular of the multiplier, they logged into the experiment web site using
their IDs and started to answer the questions. Interaction between subjects was permitted
only through the social information given by the experiment server. Each question was chosen
by the experiment server and displayed on the monitor. First, subjects answered the first
half of the 120 questions, $i \in \{1, 2, \dots, 60\}$, using their own knowledge only ($r = O$). If a
subject answers after $t$ subjects, the answer is denoted as $X(i,t+1|O)$. After answering all sixty
questions in case O, the subjects answered the same 60 questions in case C. For question $i$,
the order $t'$ in which the subject answers in case C is in general different from the order $t$
in case O (see Figure 1). Finally, the subjects answered the same questions in case M. The order
$t''$ of the subject for question $i$ is in general different from $t$ and $t'$ in the previous two
cases. In each case, the experiment server chose at random, from the sixty questions, one that
was not being served to another subject. After a five-minute interval, we repeated the same
procedure so that the subjects answered all 120 questions. By performing five sessions, we
gathered data for 57 (63) subjects in group A (B).

Figure 2 shows what the subjects saw. In the example covered in the figure,
nine subjects have already answered question 30. In case O, no social information is given,
and the subject chooses between the two options. In case C, the summary statistics from the
previous nine subjects' choices are displayed in the second row of the box. Taking
this information into account, the subject makes a choice. In case M, the multipliers are given
in the second row along with the number of subjects who have answered the question. Only one
subject among the nine has chosen A and the remaining eight subjects have chosen B. Multiplier
$M_A$ ($M_B$) is calculated as $10/(1+1) = 5$ ($10/(8+1) \approx 1.1$). The multipliers are rounded
to one decimal place. In EXP-II, the experience of the subjects is almost the same as that
of the subjects in EXP-I. The difference is that the subjects answered each question
from $r = O$ to $r = M$ before proceeding to another question. Accordingly, $t = t' = t''$ holds.
In EXP-II, the subjects were likely to remember their answers in the earlier cases with
less social information and to be careful in choosing answers in the later cases with more
social information. In order to exclude such an effect, we changed the system to that in EXP-I.

C. Max-Min Strategy in case M 

We discuss the optimized strategy for herders in case M. In the experiment, a
subject can choose $\alpha \in \{A, B\}$. We suppose that he/she votes one unit for a choice and call
him/her a voter. Here, we consider the case where one vote can be divided by the voter.
If a voter believes A is correct, he/she votes one unit for A. If a voter does not know the
answer at all, he/she votes 0.5 unit for A and 0.5 unit for B. The multiplier for choice
$\alpha \in \{A, B\}$ is $M_\alpha$. We assume that a voter thinks the probability that A is correct is
$\beta$, and the probability that B is correct is $1-\beta$. The voter divides one unit vote into
$x$ for A and $1-x$ for B. The expected return $R$ is

$$R = \beta M_A x + (1-\beta) M_B (1-x) = \beta \left( M_A x - M_B (1-x) \right) + M_B (1-x). \tag{3}$$

We assume that herders have no information about the correct answers other than the
multipliers $\{M_\alpha\}$, $\alpha \in \{A, B\}$. Hence, we assume that a herder cannot estimate the
probability $\beta$ of the correct answer; it is a Knightian uncertainty, because a herder has no
knowledge with which to answer the question [24]. In this circumstance, the max-min strategy is
proved to be optimal in game theory [21]. The voter minimizes the expected loss due to the
uncertainty in the choice. To do so, $x$ should be chosen so that $M_A x = M_B (1-x)$ holds.
From Eq. (3), the coefficient of $\beta$ then vanishes, and $R$ does not depend on $\beta$.
We can calculate $x$ as

$$x = \frac{M_B}{M_B + M_A}. \tag{4}$$

As the multiplier $M_\alpha$ is calculated as

$$M_\alpha = \frac{C_A + C_B + 1}{C_\alpha + 1},$$

the ratio $x$ for A is then, writing $t = C_A + C_B$ for the number of prior subjects,

$$x = \frac{C_A + 1}{t + 2} \simeq \frac{C_A}{t} \quad \text{for } t \gg 1. \tag{5}$$

$x$ becomes proportional to $C_A$, and this is the voting strategy of analog herders.

In our experiment, it is not possible to realize the optimal mixed strategy at the individual
level, because a voter cannot divide his/her vote (choice). It can be realized only collectively.
Hence, the averaged behavior of herders becomes akin to that of analog herders when
herders adopt the optimized strategy.
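The max-min argument can be checked numerically with a short Python sketch. The function names
`maxmin_split` and `expected_return` are hypothetical and introduced only for illustration; the
formulas are Eqs. (3) and (4), and the counts (one prior choice of A, eight of B) are those of the
example in Figure 2.

```python
def maxmin_split(m_a: float, m_b: float) -> float:
    """Fraction x of one vote placed on A such that M_A * x == M_B * (1 - x), Eq. (4)."""
    return m_b / (m_a + m_b)


def expected_return(beta: float, m_a: float, m_b: float, x: float) -> float:
    """Expected return R of Eq. (3) when A is believed correct with probability beta."""
    return beta * m_a * x + (1.0 - beta) * m_b * (1.0 - x)


# One prior subject chose A and eight chose B.
c_a, c_b = 1, 8
total = c_a + c_b + 1
m_a, m_b = total / (c_a + 1), total / (c_b + 1)

x = maxmin_split(m_a, m_b)  # equals (c_a + 1) / (c_a + c_b + 2) = 2/11, as in Eq. (5)
returns = [expected_return(beta, m_a, m_b, x) for beta in (0.0, 0.3, 0.7, 1.0)]
print(x, returns)  # the return is the same (10/11) for every beta, up to rounding
```

Because the return is flat in $\beta$ at this split, a herder with no estimate of $\beta$ loses nothing
from being wrong about it, which is the sense in which the split is max-min.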



III. DATA ANALYSIS: MACROSCOPIC ASPECTS



We obtained 240 sequences $\{X(i,t|r)\}$, $t \in \{1, 2, \dots, T_i\}$, for questions
$i \in \{1, \dots, 240\}$ and cases $r \in \{O, C, M\}$ in each experiment [25]. The percentage of
correct answers of sequence $\{X(i,t|r)\}$ for question $i$ in case $r$ is defined as
$Z(i|r) = \sum_{s=1}^{T_i} X(i,s|r)/T_i$. In
the analysis, the subjects are classified into two categories, independent and herder, for
each question. We assume that the probability of a correct choice for independent and herder
subjects is 100% and 50%, respectively [16]. For a group with a fraction $p(i)$ of herders and
$1-p(i)$ of independent subjects, the expectation value of $Z(i|O)$ is $1 - p(i)/2$. The maximum
likelihood estimate of $p(i)$ is given as $p(i) = 2(1 - Z(i|O))$. The assumption of a random guess
(50%) by the herder might be too simple. As $Z(i|O)$ approaches 0.5 and almost all subjects do not
know the answer to the question, $p(i)$ approaches 100% and the estimate works well.
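The estimate of the herder proportion can be sketched in a few lines of Python. The function name
`estimate_herder_ratio` and the example data are hypothetical; the formula is
$p(i) = 2(1 - Z(i|O))$, and values above 100% are capped at 100%, as is done later in the analysis.

```python
def estimate_herder_ratio(answers_case_o: list[int]) -> float:
    """Estimate p(i) = 2 * (1 - Z(i|O)) from the 0/1 answers of one sequence in case O.

    Herders are assumed to guess correctly with probability 1/2 and independent
    subjects to answer correctly with probability 1. Estimates above 1 are capped at 1.
    """
    if not answers_case_o:
        raise ValueError("the sequence must contain at least one answer")
    z = sum(answers_case_o) / len(answers_case_o)  # ratio of correct answers Z(i|O)
    return min(2.0 * (1.0 - z), 1.0)


# Hypothetical sequence: 45 correct answers out of 57 subjects.
p_hat = estimate_herder_ratio([1] * 45 + [0] * 12)
print(round(p_hat, 3))  # prints 0.421
```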



A. Distribution of $Z(i|r)$

There are 240 samples of sequences of choices for each $r$. We divide these samples
into 11 bins according to the size of $Z(i|r)$, as shown in Table II. The samples in each
bin of case O share almost the same value of $p$. For example, in the samples of the No. 6
bin ($0.45 < Z(i|O) < 0.55$), there are almost only herders in the subjects' sequence and
$p(i) \simeq 100\%$. In contrast, in the samples of the No. 11 bin ($Z(i|O) > 0.95$), almost all
subjects know the answer to the questions and are independent ($p(i) \simeq 0\%$). An extremely small
value of $Z(i|O)$ indicates some bias in the question. In addition, the minimum value of $Z(i|r)$
should be $1 - p(i)$. We therefore omit the samples that satisfy $Z(i|O) < 0.45$ or
$Z(i|C) < 1 - p(i)$ or $Z(i|M) < 1 - p(i)$. By these procedures, we are left with 167 (177) samples
in EXP-I (II), and we denote the set by $I'$. $I(\mathrm{No.})$ denotes the set of samples in each bin
in case O among $I'$. The samples with $Z(i|O) < 0.5$ in $I(6)$ have $p(i)$ values larger than 100%.
These values are errors of the estimation $p(i) = 2(1 - Z(i|O))$. The standard deviation of
$Z(i|O)$ is $\sqrt{p(i)/4T_i}$, so in the estimation of $p(i)$ there is a fluctuation with the
magnitude of $\sqrt{p(i)/T_i}$. If $p(i)$ takes a value larger than 100%, we take it to be 100%.

TABLE II. Effect of social information on subjects' decisions. We divide the samples according
to the size of $Z(i|r)$. $N(\mathrm{No.}|r)$ denotes the number of samples for case $r$ in each bin.
$I(\mathrm{No.})$ is the set of samples $i$ in each bin of case O after removing the samples that
satisfy $Z(i|O) < 45\%$ or $Z(i|C) < 1 - p(i)$ or $Z(i|M) < 1 - p(i)$. We use the same notation,
$I(\mathrm{No.})$, for the number of samples in the set. $p_{avg}$ is estimated as the average value of
$p(i) = 2(1 - Z(i|O))$ over the samples in $I(\mathrm{No.})$. In the last two columns, the sub-optimal
ratios for the samples of $I(\mathrm{No.})$ in case $r \in \{C, M\}$ are shown.

EXP-I

No.    Z(i|r) [%]  N(No.|O)  N(No.|C)  N(No.|M)  I(No.)  p_avg(No.) [%]  Z(i|C) < 1/2  Z(i|M) < 1/2
1      < 5         0         5         0         NA      NA              NA            NA
2      5 - 15      3         33        7         NA      NA              NA            NA
3      15 - 25     5         28        25        NA      NA              NA            NA
4      25 - 35     18        9         30        NA      NA              NA            NA
5      35 - 45     35        5         13        NA      NA              NA            NA
6      45 - 55     38        5         13        38      97.5            18/38         17/38
7      55 - 65     57        5         14        52      78.3            7/52          5/52
8      65 - 75     29        7         19        26      60.3            0/26          0/26
9      75 - 85     41        17        44        38      40.6            0/38          0/38
10     85 - 95     11        57        62        11      21.3            0/11          0/11
11     > 95        3         69        13        2       5.1             0/2           0/2
Total              240       240       240       167     66.8            35/167        22/167

EXP-II

No.    Z(i|r) [%]  N(No.|O)  N(No.|C)  N(No.|M)  I(No.)  p_avg(No.) [%]  Z(i|C) < 1/2  Z(i|M) < 1/2
1      < 5         0         2         0         NA      NA              NA            NA
2      5 - 15      0         18        6         NA      NA              NA            NA
3      15 - 25     8         22        18        NA      NA              NA            NA
4      25 - 35     16        20        23        NA      NA              NA            NA
5      35 - 45     36        8         16        NA      NA              NA            NA
6      45 - 55     43        9         19        43      96.7            16/43         15/43
7      55 - 65     46        10        16        45      79.3            8/45          3/45
8      65 - 75     45        11        26        45      62.7            2/45          0/45
9      75 - 85     33        33        56        33      41.9            0/33          0/33
10     85 - 95     11        67        54        11      21.3            0/11          0/11
11     > 95        2         37        6         0       NA              NA            NA
Total              240       240       240       177     68.7            26/177        18/177

Table II shows the number of data samples in each bin for case $r \in \{O, C, M\}$ as
$N(\mathrm{No.}|r)$. Social information causes remarkable changes in subjects' choices. For case O,
there is one peak at No. 7, and for case C (M), there are peaks at No. 2 (4) and No. 11 (10) in
EXP-I. We calculate the average value of $p(i)$ for the samples in $I(\mathrm{No.})$. We denote it as
$p_{avg}(\mathrm{No.})$ and estimate it as

$$p_{avg}(\mathrm{No.}) = \frac{1}{I(\mathrm{No.})} \sum_{i \in I(\mathrm{No.})} p(i).$$

Here, $I(\mathrm{No.})$ in the denominator means the number of samples in $I(\mathrm{No.})$, which is
given in the sixth column of the table. We show the results in the seventh column. In the last two
columns, we show the ratio of sub-optimal cases $\{Z(i|r) < 1/2\}$ for $r \in \{C, M\}$ among the
samples in $I(\mathrm{No.})$. In both cases, as $p_{avg}$ increases, the sub-optimal ratio increases
rapidly to about one half.




FIG. 3. Scatter plots of $Z(i,T_i|O)$ vs $Z(i,T_i|r)$ for (A) case C and (B) case M. The vertical
lines show the borders of the bins in Table II. The rising diagonal line from (0.5, 0) to the top
right shows the boundary condition $Z(i|r) = 1 - p$.



In order to see the social influence more pictorially, we show the scatter plots of $Z(i|O)$ vs
$Z(i|r)$, $r \in \{C, M\}$, of EXP-I in Fig. 3. The x-axis shows $Z(i|O)$ and the y-axis shows
$Z(i|r)$. The vertical lines show the boundaries between the bins (from No. 1 to No. 11) for case O
in Table II. The rising diagonal line from (0.5, 0) to the top right shows the boundary condition
$Z(i|r) = 1 - p$. If subjects' answers were not affected by social information, the data would
be distributed on the diagonal line from (0, 0) to the top right. As the plots clearly indicate,
the samples scatter more widely in the plane in case C than in case M, which means that social
influence is bigger in case C. For the samples with $Z(i|O) > 0.65$ in case O (Nos. 8, 9, 10,
and 11 bins in Table II), the changes $Z(i|C) - Z(i|O)$ are almost all positive and $Z(i|C)$ takes a
value of about 1 in case C. In case M, the changes $Z(i|M) - Z(i|O)$ are also almost all positive
and $Z(i|M)$ takes a value of about 0.9. Average performance is improved by social information
for these samples in both cases. In contrast, for the samples with $0.45 < Z(i|O) < 0.65$ (Nos. 6
and 7 bins in Table II), social information does not necessarily improve average performance.
There are many samples with $Z(i|r) - Z(i|O) < 0$ in both cases. These samples are in the
sub-optimal state and constitute the lower peak in Table II.



B. Asymptotic behavior of the convergence 

We have seen drastic changes in the distribution of $Z(i|r)$ from the distribution of $Z(i|O)$.
Table II and Figure 3 show the two-peak structure in the distribution of $Z(i|r)$. In our
previous work on the information cascade phase transition [16], we studied the time
dependence of the convergence behavior of the sequences $\{X(i,t|r)\}$. We denote the ratio
of correct answers among the first $t$ subjects as

$$Z(i,t|r) = \frac{1}{t} \sum_{s=1}^{t} X(i,s|r). \tag{6}$$

$Z(i,T_i|r) = Z(i|r)$ holds by definition. By studying the asymptotic behavior of the
convergence of sequence $\{Z(i,t|r)\}$ for the samples in $I(\mathrm{No.})$, one can clarify the
possibility of the information cascade transition by varying $p$. The variance of $Z(i,t|r)$ for the
samples in $I(\mathrm{No.})$ is defined as

$$\mathrm{Var}(Z(i,t|r))_{\mathrm{No}} = \frac{1}{I(\mathrm{No.})} \sum_{i \in I(\mathrm{No.})} \left( Z(i,t|r) - \langle Z(i,t|r) \rangle_{\mathrm{No}} \right)^2, \qquad \langle Z(i,t|r) \rangle_{\mathrm{No}} = \frac{1}{I(\mathrm{No.})} \sum_{i \in I(\mathrm{No.})} Z(i,t|r). \tag{7}$$

Here, we denote the average value of $Z(i,t|r)$ over the samples in $I(\mathrm{No.})$ by
$\langle Z(i,t|r) \rangle_{\mathrm{No}}$. In the one-peak phase, $\mathrm{Var}(Z(i,t|r))$ converges to zero in
the thermodynamic limit $t \to \infty$. Depending on the convergence behavior, the one-peak phase
is classified into two phases. If the variance of $Z(i,t|r)$ shows normal diffusive behavior as
$\mathrm{Var}(Z(i,t|r)) \propto t^{-1}$, it is called the normal diffusion phase. We note that the
variance is estimated for the ratio $C_1(i,t|r)/t$, so the usual behavior $\propto t^{1}$ for the sum
of $t$ random variables is replaced by $\propto t/t^2 = t^{-1}$. If convergence is slow and
$\mathrm{Var}(Z(i,t|r)) \propto t^{-\gamma}$ with $0 < \gamma < 1$, it is called the super diffusion
phase [26]. In the two-peak phase, $\mathrm{Var}(Z(i,t|r))$ converges to some finite
value in the limit $t \to \infty$ [17].
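As an illustration of how the exponent $\gamma$ can be read off from data, the following Python
sketch computes the variance of Eq. (7) and fits a straight line to $\log \mathrm{Var}$ against
$\log t$. The function names and the synthetic variances are hypothetical, and the plain
least-squares fit is an assumption made for this sketch; it is not necessarily the fitting
procedure used for the experimental data.

```python
import math


def variance_of_ratio(sequences: list[list[int]], t: int) -> float:
    """Variance over sequences of Z(i,t) = (1/t) * sum of the first t answers, Eqs. (6)-(7)."""
    ratios = [sum(seq[:t]) / t for seq in sequences]
    mean = sum(ratios) / len(ratios)
    return sum((z - mean) ** 2 for z in ratios) / len(ratios)


def fit_gamma(ts: list[int], variances: list[float]) -> float:
    """Least-squares slope of log(variance) against log(t); returns gamma = -slope.

    All variances must be strictly positive, because their logarithm is taken.
    """
    xs = [math.log(t) for t in ts]
    ys = [math.log(v) for v in variances]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Synthetic check: variances that decay exactly as t^-1 and as t^-0.5.
ts = list(range(10, 61))
gamma_normal = fit_gamma(ts, [0.25 / t for t in ts])       # normal diffusion
gamma_slow = fit_gamma(ts, [0.25 / t ** 0.5 for t in ts])  # super diffusion
print(gamma_normal, gamma_slow)
```

A fitted $\gamma$ close to 1 corresponds to the normal diffusion phase, a value between 0 and 1 to
the super diffusion phase, and a value near 0 to a variance that does not decay.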

Figure 4 shows the double logarithmic plots of $\mathrm{Var}(Z(i,t|r))_{\mathrm{No}}$ as a function of
$t$. If the plot has a negative slope $-\gamma$ with $\gamma > 0$ in the limit $t \to \infty$, the
system is in the one-peak phase. If the slope is zero and $\gamma = 0$ in the limit, the system is
in the two-peak phase. We see that convergence becomes very slow as $p_{avg}(\mathrm{No.})$ increases
in general. Exponent $\gamma$ is estimated by fitting with $\propto t^{-\gamma}$ for $t > 10$ in EXP-I.
It decreases from almost 1 to $-0.02$ (0.14) with an increase in $p_{avg}$ in case C (M). For the
samples in $I(9)$ and $I(8)$, the values of $\gamma$ are almost 1 and the system is in the normal
diffusion (one-peak) phase in both cases $r \in \{C, M\}$. For the samples in $I(7)$, the values of
$\gamma$ are apparently smaller than 1 and the system might be in the super diffusion phase.
However, system size $T$ is very limited, and in the thermodynamic limit $T \to \infty$, $\gamma$
might converge to 1. For the samples in $I(6)$, $\gamma$ becomes negative ($\gamma = -0.02$) in case C.
This suggests that the system is in the two-peak phase for the samples in $I(6)$ and that threshold
value $p_c$ is between 78.5% and 97.5%. In case M, $\gamma$ is positive ($\gamma = 0.14$) even for the
samples in $I(6)$ and the system might be in the super diffusion phase. However, the result does
not necessarily rule out the existence of the two-peak phase, taking into account the estimation
error of $p(i)$ and the estimation error of $\gamma$ from the limited sample size. We can only say
that if the two-peak phase exists, threshold value $p_c$ in case M is considerably larger than that
in case C.

FIG. 4. Convergence behavior, shown by the double logarithmic plot of
$\mathrm{Var}(Z(i,t|r))_{\mathrm{No}}$ vs $t$ using the samples in the four bins (No. 6 (o), 7 (△), 8 (o)
and 9 (x) in Table II) for (A) Case C in EXP-I, (B) Case M in EXP-I, (C) Case C in EXP-II, and
(D) Case M in EXP-II. The dotted lines are fitted results with $\propto t^{-\gamma}$ for $t > 10$ (20)
in EXP-I (II).



IV. DATA ANALYSIS: MICROSCOPIC ASPECTS



In this section, we study the microscopic aspects of herders. We clarify how they copy
others' choices and derive a microscopic rule in each case $r \in \{C, M\}$. In particular, we
study whether they behave as analog herders in case M.



A. How do herders copy others? 

We determine how a herder's decision depends on social information. For this purpose,
we need to subtract the independent subjects' contribution from $X(i,t+1|r)$. The probability of
being independent is $1 - p(i)$, and such a subject always chooses 1. A herder's contribution
is estimated as

$$\frac{X(i,t+1|r) - (1 - p(i))}{p(i)}.$$

How the herder's decision depends on $C_1(i,t|r) = n_1$ is estimated by the expectation value
of $(X(i,t+1|r) - (1 - p(i)))/p(i)$ under this condition. The expectation value is the
probability that a herder chooses an option when $n_1$ of the prior $t$ subjects have chosen
the same option. We denote it by $q_h(t,n_1|r)$ and estimate it as

$$q_h(t,n_1|r) = \frac{\sum_{i \in I'} \dfrac{X(i,t+1|r) - (1 - p(i))}{p(i)} \, \delta_{C_1(i,t|r),\,n_1}}{\sum_{i \in I'} \delta_{C_1(i,t|r),\,n_1}}. \tag{8}$$

Here, $\delta_{i,j}$ is 1 (0) if $i = j$ ($i \neq j$), and the denominator is the number of sequences
where $C_1(i,t|r) = n_1$. From the symmetry $1 \leftrightarrow 0$, we assume that
$q_h(t,n_1) = 1 - q_h(t,t-n_1)$. We study the dependence of $q_h(t,n_1)$ on $n_1/t$ and round
$n_1/t$ to the nearest value in $\{k/13\,(12) \mid k \in \{0, 1, 2, \dots, 13\,(12)\}\}$ in EXP-I (II).
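The estimator of Eq. (8) can be sketched in Python as follows. This is a simplified illustration
under stated assumptions: the function name `herder_response` is hypothetical, the sums are pooled
over all $t$ after rounding $n_1/t$ to the nearest $k/13$, and the symmetrization
$q_h(t,n_1) = 1 - q_h(t,t-n_1)$ is not applied.

```python
from collections import defaultdict


def herder_response(sequences, herder_ratios, n_bins=13):
    """Estimate q_h as a function of the binned ratio n_1/t, following Eq. (8).

    sequences[i] is the list of 0/1 answers X(i,1), X(i,2), ... of sequence i,
    and herder_ratios[i] is the estimated herder proportion p(i) of that sequence.
    Returns a dict mapping the binned ratio k/n_bins to the estimated q_h.
    """
    numer = defaultdict(float)
    denom = defaultdict(int)
    for answers, p in zip(sequences, herder_ratios):
        if p <= 0:
            continue  # no herders in this sequence, nothing to estimate
        c1 = 0
        for t in range(1, len(answers)):
            c1 += answers[t - 1]             # C_1(i,t): correct answers among the first t
            k = round(n_bins * c1 / t)       # bin index of n_1/t
            # Herder contribution of the (t+1)-th answer, (X - (1 - p)) / p
            numer[k] += (answers[t] - (1.0 - p)) / p
            denom[k] += 1
    return {k / n_bins: numer[k] / denom[k] for k in sorted(denom)}


# Two toy sequences made only of herders (p = 1) who always copy a unanimous majority.
q_h = herder_response([[1, 1, 1, 1], [0, 0, 0, 0]], [1.0, 1.0])
print(q_h)  # {0.0: 0.0, 1.0: 1.0}
```

Individual terms in the numerator can fall outside the interval [0, 1] when $p(i) < 1$; only their
average over many sequences is interpreted as a probability.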

Figure [5] shows the plot of qh(t, ni\r) for (A) case C and (B) case M. We can clearly see 
the strong tendency to copy others in case C. As n\jt increases from 1/2, qh(t, n\\C) rapidly 
increases and the slope at n\jt = 1/2 is about 2.0 in EXP-I. Such nonlinear behavior is known 
as quorum response in social science and ethology [27| • The magnitude of the slope measures 
the strength of herders'response. Comparing EXP-I and EXP-II, the response of the herder 
is more sharp in EXP-I than in EXP-II. In EXP-II, where the amount of social information 
increases gradually, the subjects tend to copy others' choices more prudent than in EXP-I. If 
the slope exceeds 1, the system shows the information cascade phase transition. Transition 
ratio p c depends on the slope. In the digital herder case, where qh(t,ni) = 9{n\ — 1/2) and 



the slope is infinite, p c takes 0.5 (17J. As the slope reduces to 1, p c increases to 1 and the 



phase transition disappears in the limit [23 



Contrary to case C, the dependence of $q_h(t, n_1|M)$ on $n_1/t$ is weak and the slope at 
$n_1/t = 1/2$ is almost 1 in case M. In the range $1/4 < n_1/t < 3/4$, $q_h(t, n_1|M)$ lies on the 
diagonal dotted line and the herders behave almost as analog herders. They collectively 
adopted the optimal mixed strategy. As the slope at $n_1/t = 1/2$ is small, if the information 
cascade phase transition occurs, transition ratio $p_c$ should be larger than in 
case C. In case M, the wrong convergence of the majority choice occurs only when almost nobody 
knows the correct answer. One can also see an interesting behavior of herders. 
If the minority choice ratio $n_1/t$ is smaller than 1/4 and multiplier $m$ exceeds 4, some herders 
still choose the minority option. As a result, for $n_1/t > 3/4$, $q_h(t, n_1|M)$ becomes almost constant, about 
3/4. We can interpret this as some of the herders preferring a big multiplier (long shot), so that 
$q_h(t, n_1|M)$ saturates at 3/4. This behavior is not rational and is known as the favorite-longshot 
bias in race-track betting markets [19]. 

FIG. 5. Microscopic rule of herders' decisions for (A) case C and (B) case M. It shows the 
probability $q_h(t, n_1|r)$ that a herder chooses an option when $n_1$ of the $t$ prior subjects 
have chosen it in case $r$. The solid curves are the fitted results with Eq. (9). The dotted diagonal 
line shows the analog herder model $q_h(t, n_1) = n_1/t$. The thin dashed line in (A) shows 
$2(n_1/t - 1/2) + 1/2$. 

B. Average herders and logistic herders 



As in our previous work [16], we model the behavior of herders by the following functional 
form: 

$$ q_h(t, n_1|r) = \frac{1}{2}\left(a_r \tanh\left(\lambda_r (n_1/t - 1/2)\right) + 1\right). \tag{9} $$

Parameters $a_r$ and $\lambda_r$ indicate the strength of the conformity of subjects. $a_r$ is the net ratio 
of herders who react positively to prior subjects' choices [28]. $\lambda_r$ denotes the strength of the 
response to social information. The combined factor $\frac{1}{2} a_r \lambda_r$ is the slope of the functional form 
at $n_1/t = 1/2$. We call a herder who chooses according to Eq. (9) a logistic herder, and 
we call this model the logistic herder (LH) model. In addition, we introduce an average herder 
(AvH) model where $q_h(t, n_1|r)$ is given by linear extrapolation of the values in Eq. (8). We 
describe the estimation procedure of parameters $\{\lambda_r, a_r\}$ of the LH model in Appendix C. 
The estimation results of the parameters, $\log L$ (Eq. (C4)), and AIC (Eq. (C5)) for the LH model 
are given in Table III. The fitted results are shown in Fig. 5. For comparison, we show the 
log likelihood $\log L$ and AIC of the AvH model. The AvH model has seven (six) parameters 
as $k \in \{7, 8, 9, \ldots, 13\ (12)\}$ in EXP-I (II). 
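
As a minimal sketch of Eq. (9), the following Python snippet evaluates the logistic herder response and its slope at $n_1/t = 1/2$. The function names and the dictionary layout are illustrative choices for this sketch and do not come from the paper; the parameter values are the LH estimates listed in Table III.

```python
import math


def q_lh(ratio, lam, a):
    """Eq. (9): probability that a logistic herder picks an option that a
    fraction `ratio` = n_1/t of the prior subjects has chosen."""
    return 0.5 * (a * math.tanh(lam * (ratio - 0.5)) + 1.0)


def slope_at_half(lam, a):
    """Derivative of Eq. (9) with respect to n_1/t, evaluated at n_1/t = 1/2."""
    return 0.5 * a * lam


# (lambda_r, a_r) for the LH model as listed in Table III.
# The keys (experiment, case) are labels chosen for this sketch.
TABLE_III_LH = {
    ("I", "C"): (3.58, 0.879),
    ("I", "M"): (4.73, 0.544),
    ("II", "C"): (3.59, 0.785),
    ("II", "M"): (5.88, 0.469),
}

if __name__ == "__main__":
    for (exp, case), (lam, a) in TABLE_III_LH.items():
        print(exp, case, round(slope_at_half(lam, a), 3))
```

Because the slope is the product $\frac{1}{2} a_r \lambda_r$, a large $\lambda_r$ can be offset by a small $a_r$, which is why the two parameters have to be read together.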



TABLE III. Parameter estimates for the logistic herder (LH) model. We show the fitted results, 
$\lambda_r$, $a_r$, of the LH model for case $r \in \{C, M\}$ in EXP-I and EXP-II. The fourth column shows the 
number of observations and the fifth column shows the number of parameters $N_p$ of the model. In 
the eighth column, $\log L$ is given. The last column shows AIC. For comparison, we show $\log L$ and 
AIC for the average herder (AvH) model. 

EXP   r   Model   #Obs   $N_p$   $\lambda_r$   $a_r$   $\log L$   AIC
I     C   AvH     9850   7       NA            NA      -2951.6    5917.1
I     C   LH      9850   2       3.58          0.879   -2951.5    5907.1
I     M   AvH     9819   7       NA            NA      -4592.5    9198.9
I     M   LH      9819   2       4.73          0.544   -4601.6    9207.2
II    C   AvH     8809   6       NA            NA      -3468.8    6949.5
II    C   LH      8809   2       3.59          0.785   -3470.6    6945.3
II    M   AvH     8809   6       NA            NA      -4406.6    8825.3
II    M   LH      8809   2       5.88          0.469   -4418.3    8840.5



In case C, $\log L$ of the LH model is comparable to that of the AvH model, so the LH 
model describes the behavior of the herders as well as the AvH model does. As $N_p$ of the LH 
model is smaller than that of the AvH model, the LH model is better than the AvH model 
from the viewpoint of AIC. The slope at $n_1/t = 1/2$ in the LH model is estimated as 
$\frac{1}{2} a_C \lambda_C = 1.57$ (1.41) in EXP-I (II). In contrast, in the AvH model, the slope is roughly 2 
in EXP-I (see Figure 5A). As the slopes are larger than 1, the information cascade phase 
transition occurs at $p_c < 100\%$. As the value of the slope is crucial in the estimation of 
system performance, we study $q_h(t, n_1|C)$ near $n_1/t = 1/2$ in detail in the next subsection, and we 
defer the choice of model for case C until then. 

In case M, the LH model does not describe herders' behavior well. Comparing $\log L$ 
of the two models, we see that the AvH model is more faithful to herders' behavior. A 
comparison of AIC also shows that the AvH model is better than the LH model. As one can see 
clearly from Figure 5B, the slope at $n_1/t = 1/2$ is almost 1, and at $n_1/t = 0.7$, $q_h(t, n_1|M)$ 
crosses the diagonal line. It looks linear or downward convex in the region $1/2 < n_1/t < 3/4$. In 
contrast, the LH model is upward convex for $n_1/t > 1/2$, and the fitted result suggests that 
the slope at $n_1/t = 1/2$ is about 1.28 (1.38), from the estimate of $\frac{1}{2} a_M \lambda_M$ in EXP-I (II). 
These discrepancies are crucial and we adopt the AvH model to describe herders' behavior in case M. 

FIG. 6. Plots of $k(x|r)$ vs $x$ for $r \in \{C, M\}$. We use all data $\{X(i,t|r)\}$ from EXP-I and EXP-II 
that satisfy $|C_1(i,t-1|r)/(t-1) - 1/2| < x$, and fit $k$ in $q_h(t, n_1|r) = k(n_1/t - 1/2) + 1/2$ by 
maximum likelihood estimation. The thick solid (dashed) line plots the results for case C (M). 
$k = 1$ (thin dotted line) corresponds to the analog herder $q_h(t, n_1) = 1 \cdot n_1/t$. 

C. Behavior of $q_h(t, n_1|r)$ near $n_1/t = 1/2$ 

In case C, the LH model and the AvH model describe the behavior of herders equally 
faithfully, as they have almost the same $\log L$. However, they differ in the slope of 
$q_h(t, n_1|C)$ at $n_1/t = 1/2$, and the magnitude of the slope is crucial to the properties and 
performance of the system. Because of the cascade effect, the samples move to the edges 
of $[0, 1]$ on the $n_1/t$ axis. In the region $n_1/t \simeq 1/2$, the number of observations is small and the 
maximum likelihood estimate in the previous subsection does not capture the behavior near 
$n_1/t = 1/2$. We therefore study $q_h(t, n_1|r)$ in this region in detail. As we are interested in the slope of 
$q_h(t, n_1|r)$ at $n_1/t = 1/2$, we assume the following functional form with slope parameter $k$, 

$$ q_h(t, n_1|r) = k \cdot (n_1/t - 1/2) + \frac{1}{2}. $$

Here, $k$ is the slope at $n_1/t = 1/2$. We consider a region $[1/2 - x, 1/2 + x]$ with a small 
and positive parameter $x < 0.3$. We fit $k$ by the maximum likelihood method and write 
the result as $k(x|r)$. In the estimation of $l$ in Eq. (C3), we use only the data $\{X(i,t|r)\}$ that 
satisfy $|C_1(i,t-1|r)/(t-1) - 1/2| < x$. We combine the data from EXP-I and EXP-II to 
address the data scarcity for small $x$. 
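
The windowed fit described above can be sketched in Python as follows. This is only an illustration under stated assumptions: each observation is taken to be a triple (ratio $n_1/t$, choice, herder proportion $p$), the maximization is a plain grid search rather than whatever optimizer the authors used, and the data are synthetic, generated inside the snippet, not the experimental data. All function names are hypothetical.

```python
import math
import random


def prob_one(ratio, k, p):
    """Probability of choice 1 when herders follow q_h = k(ratio - 1/2) + 1/2
    and a fraction 1 - p of the subjects are independent voters who choose 1."""
    q_h = min(max(k * (ratio - 0.5) + 0.5, 1e-9), 1.0 - 1e-9)  # keep inside (0, 1)
    return (1.0 - p) + p * q_h


def log_likelihood(k, data):
    """Bernoulli log likelihood of (ratio, choice, p) observations for slope k."""
    total = 0.0
    for ratio, choice, p in data:
        q1 = prob_one(ratio, k, p)
        total += math.log(q1 if choice == 1 else 1.0 - q1)
    return total


def fit_slope(data, x, k_grid):
    """Keep observations with |ratio - 1/2| < x and return the value of k on
    the grid that maximizes the log likelihood."""
    window = [obs for obs in data if abs(obs[0] - 0.5) < x]
    return max(k_grid, key=lambda k: log_likelihood(k, window))


def synthetic_data(n, k_true, p, seed=0):
    """Synthetic observations for illustration only (not experimental data)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        ratio = rng.uniform(0.2, 0.8)
        choice = 1 if rng.random() < prob_one(ratio, k_true, p) else 0
        data.append((ratio, choice, p))
    return data


if __name__ == "__main__":
    data = synthetic_data(20000, k_true=1.5, p=0.7)
    k_grid = [0.05 * j for j in range(51)]  # k from 0 to 2.5
    print(fit_slope(data, x=0.2, k_grid=k_grid))
```

Shrinking `x` in this sketch reduces the number of observations in the window, which is the same trade-off between locality and estimation error that the text describes.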

Figure 6 plots $k(x|r)$ vs $x$ for $r \in \{C, M\}$. We are interested in the slope at $n_1/t = 1/2$, 
and it is necessary to see the limit $x \to 0$. As $x$ decreases, the number of observations decreases 
and the estimation error increases. We set $x > 0.04$. In case C, $k(x|C)$ increases almost 
monotonically as $x$ decreases. It takes a value of about 2 at $x = 0.05$, which coincides with the behavior of 
$q_h(t, n_1|C)$ in Figure 5A. The AvH model describes herders' behavior for $x > 0.05$. 
We adopt the AvH model to describe herders' behavior in case C. In case M, $k(x|M)$ 
starts from about one half at $x = 0.04$ and rapidly increases to about 1 at $x = 0.1$. In the range 
$0.1 < x < 0.25$, $k(x|M)$ is almost 1, and near $x = 0.2$ it exceeds 1 slightly. These behaviors 
reflect that $q_h(t, n_1|M)$ is downward convex in the range $0.5 < n_1/t < 0.55$ and that, up to $n_1/t = 3/4$, 
$q_h(t, n_1|M)$ behaves almost as $n_1/t$. As in case C, we adopt the AvH model to describe 
herders' behavior. 



V. STOCHASTIC MODEL AND SIMULATION STUDY 

The asymptotic analysis of the convergence of $Z(i,t|r)$ in case C shows the possibility of 
the two-peak phase for the samples in $I(6)$. Exponent $\gamma$ is remarkably small, $-0.02$ (0.06) 
in EXP-I (II). In case M, the exponent of convergence for the same samples is positive, 
0.14 (0.16), in EXP-I (II). However, the system sizes are limited, and this causes estimation 
error in $p(i)$. In addition, the estimation error from the very limited number of samples in 
each bin cannot be neglected. In this section, based on the herders' microscopic rule derived 
in the previous section, we simulate the system. We estimate the transition ratio $p_c(r)$ of 
the information cascade phase transition. In addition, we compare the performance of the 
system in case C with that in case M. 



A. Voting model and transition ratio $p_c$ 

To understand the behavior of the system in the thermodynamic limit $T \to \infty$, we simulate 
the system for large $T$ by a stochastic model based on Eq. (8). We introduce a stochastic 
process $\{X(t|r,p)\}$, $t \in \{1, 2, 3, \ldots, T\}$ for $r \in \{C, M\}$ and $p \in [0, 1]$. $X(t+1|r,p) \in \{0, 1\}$ 
is a Bernoulli random variable. Its probabilistic rule depends on $C_1(t) = \sum_{t'=1}^{t} X(t'|r,p)$ 
and the herders' proportion $p$. Given $C_1(t) = n_1$, the probabilistic rule that $X(t+1|r,p)$ 
obeys is 

$$ \mathrm{Prob}(X(t+1|r,p) = 1 | n_1) = (1 - p) + p \cdot q_h(t, n_1|r) = q(t, n_1|r,p), $$
$$ \mathrm{Prob}(X(t+1|r,p) = 0 | n_1) = p \cdot (1 - q_h(t, n_1|r)). \tag{10} $$

Here, we denote by $q(t, n_1|r,p)$ the probability that $X(t+1|r,p)$ takes 1 under this condition. 
We adopt the AvH model for $q_h(t, n_1|r)$ and use the results of the experiments (Figure 5). 
We denote the probability function $\mathrm{Prob}(C_1(t) = n)$ for $r$ and $p$ as $P(t, n|r,p)$. The master 
equation for $P(t, n|r,p)$ is 

$$ P(t+1, n|r,p) = q(t, n-1|r,p) \cdot P(t, n-1|r,p) + (1 - q(t, n|r,p)) \cdot P(t, n|r,p). \tag{11} $$

The expected value of $Z(t|r,p) = \frac{1}{t} C_1(t)$ is then estimated as 

$$ E(Z(t|r,p)) = \sum_{n=0}^{t} P(t, n|r,p) \cdot \frac{n}{t}. $$
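
The voting rule, the master equation, and the expectation above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the paper uses the experimentally estimated AvH response for $q_h$, whereas the snippet plugs in the analog herder $q_h = n_1/t$ as a placeholder, and the value $q_h = 1/2$ at $t = 0$ is an assumption of the sketch. All function names are hypothetical.

```python
import random


def analog_herder(t, n1):
    """Placeholder q_h: analog herder n_1/t, with 1/2 assumed at t = 0."""
    return n1 / t if t > 0 else 0.5


def vote_probability(t, n1, p, q_h):
    """q(t, n_1|r, p) of Eq. (10): probability that the next subject chooses 1."""
    return (1.0 - p) + p * q_h(t, n1)


def simulate_sequence(T, p, q_h, seed=None):
    """Sample one sequence X(1), ..., X(T) from the rule in Eq. (10)."""
    rng = random.Random(seed)
    n1, choices = 0, []
    for t in range(T):  # t = number of prior subjects
        x = 1 if rng.random() < vote_probability(t, n1, p, q_h) else 0
        choices.append(x)
        n1 += x
    return choices


def evolve_distribution(T, p, q_h):
    """Iterate Eq. (11) from P(0, 0) = 1; returns [P(T, n) for n = 0..T]."""
    dist = [1.0]
    for t in range(T):
        new = [0.0] * (t + 2)
        for n, prob in enumerate(dist):
            q = vote_probability(t, n, p, q_h)
            new[n + 1] += q * prob        # next subject chooses 1
            new[n] += (1.0 - q) * prob    # next subject chooses 0
        dist = new
    return dist


def expected_z(dist):
    """E(Z(t)) = sum_n P(t, n) * n / t for a distribution over n = 0..t."""
    t = len(dist) - 1
    return sum(prob * n for n, prob in enumerate(dist)) / t


if __name__ == "__main__":
    print(simulate_sequence(10, p=0.5, q_h=analog_herder, seed=0))
    print(expected_z(evolve_distribution(60, p=0.5, q_h=analog_herder)))
```

The recursion costs of order $T^2$ operations, so a pure-Python loop like this one is practical only for moderate $T$; reaching very large $T$ would require a vectorized or compiled implementation.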



TABLE IV. Transition ratio $p_c$ of the AvH model. We determine $p_c$ by the condition that the 
self-consistent equation (Eq. (12)) has multiple solutions for $p > p_c$. 

EXP.   r   $p_c(r)$   r   $p_c(r)$
I      C   86.0%      M   95.7%
II     C   86.5%      M   96.7%



We are interested in the limit value of $Z(t|r,p)$ as $t \to \infty$, which we denote as $z$: 

$$ z = \lim_{t \to \infty} Z(t|r,p). $$



In the one-peak phase, $Z(t|r,p)$ always converges to a value larger than one half, which we denote 
as $z_+$. In the two-peak phase, in addition to $z_+$, $Z(t|r,p)$ converges with some positive 
probability to a value smaller than one half, which we denote as $z_-$. One cannot predict to which 
fixed point the system converges; it is a probabilistic process. One way to determine threshold value 
$p_c$ between these phases and limit values $z_\pm$ is to solve the following self-consistent 
equation [18]: 

$$ z = q(t, t \cdot z|r,p). \tag{12} $$

Given $p$, if there is only one solution, it is $z_+$ and the system is in the one-peak phase. 
Convergence exponent $\gamma$ is obtained by estimating the slope of $q(t, z \cdot t|r,p)$ at $z = z_+$ 
[18, 26]. If there are three solutions, which we denote as $z_1 < z_u < z_2$, $z_1$ ($z_2$) corresponds 
to $z_-$ ($z_+$). Middle solution $z_u$ is an unstable state and $Z(t|r,p)$ departs from $z_u$ as $t$ 
increases. The method gives rigorous values for the LH model. In the AvH model, 
we think it works in case C, where the self-consistent equation has at most three solutions 
(Figure 5A). In case M, the situation is uncertain. The self-consistent equation has five 
solutions at $p = 1.0$ and it is not clear whether the above theoretical analysis works. 
With these notes in mind, we show $p_c$ in Table IV. In case C, $p_c(C)$ ranges from 86.0% (EXP-I) 
to 86.5% (EXP-II). In case M, $p_c(M)$ ranges from 95.7% (EXP-I) to 96.7% (EXP-II). 
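
The counting argument behind the self-consistent equation can be sketched in Python. The snippet is an illustration under assumptions, not a reproduction of Table IV: it uses the logistic herder form $q_h(z) = \frac{1}{2}(a \tanh(\lambda(z - 1/2)) + 1)$ as a stand-in for the herder response (the values in Table IV are for the AvH model and are not recomputed here), it counts solutions by sign changes on a finite grid, and the function names are hypothetical.

```python
import math


def q_lh(z, lam, a):
    """Logistic herder response as a function of the ratio z = n_1/t."""
    return 0.5 * (a * math.tanh(lam * (z - 0.5)) + 1.0)


def count_solutions(p, lam, a, cells=2000):
    """Count solutions of z = (1 - p) + p * q_h(z) in (0, 1) by counting
    sign changes of f(z) = (1 - p) + p * q_h(z) - z on cell midpoints."""
    def f(z):
        return (1.0 - p) + p * q_lh(z, lam, a) - z

    values = [f((i + 0.5) / cells) for i in range(cells)]
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)


def transition_ratio(lam, a, tol=1e-4):
    """Bisection for the smallest p at which more than one solution exists.
    Returns None if even p = 1 gives a single solution."""
    lo, hi = 0.0, 1.0
    if count_solutions(hi, lam, a) <= 1:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_solutions(mid, lam, a) > 1:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    # Illustrative parameters: the LH estimates for case C in EXP-I.
    print(count_solutions(0.5, 3.58, 0.879), count_solutions(1.0, 3.58, 0.879))
    print(transition_ratio(3.58, 0.879))
```

Using cell midpoints avoids evaluating $f$ exactly at $z = 1/2$, where it vanishes identically for $p = 1$ and would otherwise hide a sign change.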

In order to check the above results, we solve the master equation recursively and obtain 
$P(t, n|r,p)$ for $t \leq T = 10^6$ for EXP-I. We estimate convergence exponent $\gamma$ from the slope 
of $\mathrm{Var}(Z(t|r,p))$ as 

$$ \gamma = \log\frac{\mathrm{Var}(Z(T - \Delta T|r,p))}{\mathrm{Var}(Z(T|r,p))} \Big/ \log\frac{T}{T - \Delta T}. \tag{13} $$

For time horizon $T = 60$, we take $\Delta T = 50$ to match the analysis of the experimental data 
in Figures 4A and B. For $T = 10^3$ and $10^6$, we take $\Delta T = 10^2$. In order to give the error 
bar of $\gamma$ for the experimental results, we adopted the voting model to simulate the system 
and estimate the 95% confidence interval [16]. The results are summarized in Figure 7 for 
(A) case C and (B) case M. For $T = 60$, the model describes the experimental results well. 
$\gamma$ shows non-monotonic behavior as a function of $p$; this is an artifact of finite $T$. In the limit 
$T \to \infty$, $\gamma$ monotonically decreases from 1 to 0. For $T = 10^6$, $\gamma$ becomes less than $10^{-2}$ at 
$p = 0.87$ (0.99) in case C (M), which gives the estimate of $p_c(r)$. These values are consistent 
with the ones given by the self-consistent equation in Table IV. For $p < p_c(r)$ ($p > p_c(r)$), 
the system in case $r \in \{C, M\}$ is in the one-peak (two-peak) phase. 

FIG. 7. Plot of $\gamma$ vs $p$. We plot the results of the AvH model for EXP-I for (A) case C and (B) 
case M. Symbol ($\circ$) denotes $\gamma$ vs $p_{avg}$ in EXP-I, as estimated in Figure 4. The lines 
show the results of the stochastic model with system size $T = 60$ (thin solid), $10^3$ (thin dashed), 
and $10^6$ (thick solid). 

FIG. 8. Plot of herders' performance, $(E(Z(T|r)) - (1 - p))/p$ vs $p$ for the voting model. Symbol 
$\circ$ ($\triangle$) indicates the experimental data for the four bins $I(6)$, $I(7)$, $I(8)$, and $I(9)$ in Table II for 
case C (M). The lines show the results of the stochastic model with system size $T = 60$, $r = C$ 
(thin solid), $T = 60$, $r = M$ (thin dashed), $T = 10^6$, $r = C$ (thick solid), and $T = 10^6$, $r = M$ (thick dashed). 
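
The estimate of the convergence exponent can be sketched in Python. The snippet rests on an assumption that should be read as such: $\mathrm{Var}(Z(t))$ is taken to decay as $t^{-\gamma}$, so that $\gamma$ is the discrete log-log slope of the variance between $T - \Delta T$ and $T$. The distribution is obtained from the master-equation recursion, the two herder responses are simple placeholders rather than the AvH response used in the paper, and all function names are hypothetical.

```python
import math


def evolve_distribution(T, p, q_h):
    """Master-equation recursion from P(0, 0) = 1; returns P(T, n), n = 0..T."""
    dist = [1.0]
    for t in range(T):
        new = [0.0] * (t + 2)
        for n, prob in enumerate(dist):
            q = (1.0 - p) + p * q_h(t, n)
            new[n + 1] += q * prob
            new[n] += (1.0 - q) * prob
        dist = new
    return dist


def variance_z(dist):
    """Variance of Z = n / t under a distribution over n = 0..t."""
    t = len(dist) - 1
    mean = sum(prob * n for n, prob in enumerate(dist)) / t
    second = sum(prob * n * n for n, prob in enumerate(dist)) / (t * t)
    return second - mean * mean


def convergence_exponent(T, dT, p, q_h):
    """Log-log slope of Var(Z) between T - dT and T (assumes Var ~ t^-gamma)."""
    var_early = variance_z(evolve_distribution(T - dT, p, q_h))
    var_late = variance_z(evolve_distribution(T, p, q_h))
    return math.log(var_early / var_late) / math.log(T / (T - dT))


def indifferent_herder(t, n1):
    """Placeholder: herders ignore social information and choose 1 with 1/2."""
    return 0.5


def analog_herder(t, n1):
    """Placeholder: analog herder n_1/t, with 1/2 assumed at t = 0."""
    return n1 / t if t > 0 else 0.5


if __name__ == "__main__":
    print(convergence_exponent(60, 50, 0.6, indifferent_herder))
    print(convergence_exponent(60, 50, 1.0, analog_herder))
```

The two placeholder cases bracket the range discussed in the text: independent votes give normal convergence with exponent 1, while a population of pure analog herders locks onto the first vote, so the variance stops shrinking and the exponent is 0.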



B. Performance of herders 



In order to compare the performance of herders in cases C and M, we estimate 
the probability that herders choose the correct answer as a function of $p$ [29]. For the 
model, it can be estimated using the expectation value of $Z(t|r,p)$ as 

$$ \left(E(Z(t|r,p)) - (1 - p)\right)/p. $$

For the experimental data, we take the average of $(Z(i|r) - (1 - p(i)))/p(i)$ over the samples 
in $I(\mathrm{No.})$: 

$$ \frac{1}{|I(\mathrm{No.})|} \sum_{i \in I(\mathrm{No.})} \left(Z(i|r) - (1 - p(i))\right)/p(i). $$

We plot the results in Figure 8. The experimental results show that the performance of 
herders in case C is better than that in case M except for the samples in $I(6)$. As system 
size $T$ increases, for $p < p_c(C)$, the performance in case C remains better than that in case M. 
However, as $p$ exceeds $p_c(C)$, the former rapidly decreases and dips below the latter: 
the information cascade transition greatly lowers herders' performance in case C. 
In contrast, the poor performance of herders in case M for $p < p_c(C)$ 
comes from the saturation of $q_h(t, n_1|M)$ at about 3/4, that is, the favorite-longshot bias. Because of this 
saturation, the performance of herders in case M cannot reach a high value. As a result, 
the performance in case M falls below that in case C for $p < p_c(C)$. 
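
The performance measure used above follows from splitting the correct-answer ratio into independent voters, who are always correct, and herders. A minimal Python sketch, with hypothetical function names and made-up numbers in the usage line (they are not experimental values):

```python
import math


def herder_accuracy(z, p):
    """Fraction of herders who chose correctly, given the overall correct
    ratio z and the herder proportion p (p must be positive). It inverts
    z = (1 - p) * 1 + p * accuracy."""
    return (z - (1.0 - p)) / p


def mean_herder_accuracy(samples):
    """Average of herder_accuracy over (z, p) pairs, one pair per question
    in a bin."""
    return sum(herder_accuracy(z, p) for z, p in samples) / len(samples)


if __name__ == "__main__":
    # Made-up example: 90% correct overall with 50% herders.
    print(herder_accuracy(0.9, 0.5))
    assert math.isclose(herder_accuracy(0.9, 0.5), 0.8)
```

Because the measure divides by $p$, it amplifies fluctuations of $Z$ for questions with few herders, which is one reason to average over a bin of samples.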
VI. CONCLUSIONS 



Social influence, which here is restricted to information regarding the choices of 
others, yields inaccuracy in the wisdom of crowds [2]. If a herder receives summary statistics 
$\{C_A, C_B\}$ and the payoff for the correct choice is constant, the herder strongly tends to copy the 
majority. The correct information given by independent voters is buried below the herd, 
and the majority choice does not necessarily teach us the correct one if herders' proportion 
exceeds $p_c(C)$ [16]. By setting the return to be proportional to multipliers $\{M_A, M_B\}$ that 
are inversely proportional to summary statistics $\{C_A, C_B\}$, the situation changes drastically. 
In this case, the optimal behavior in game theory is that of an analog herder. An analog 
herder chooses $\alpha \in \{A, B\}$ with probability proportional to $C_\alpha$. If a herder behaves as an 
analog herder, the phase transition to the two-peak phase can be avoided and the majority 
choice does converge to the correct one in the thermodynamic limit [23]. We studied herders' 
behavior under the influence of multipliers $\{M_A, M_B\}$ and showed that they behave almost 
as analog herders for $4/3 < m < 4$, where $m$ is the multiplier. Outside the region, we see 
the favorite-longshot bias [19], and observe that herders' copy probability $q_h(t, n_1|M)$ deviates 
from that of analog herders, $q_h(t, n_1) = n_1/t$. As a result, the threshold value $p_c$ of the 
information cascade phase transition becomes extremely large. 
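
Why the analog herder is the max-min choice can be checked with a few lines of Python. The exact multiplier formula is not restated in this section, so the sketch assumes $M_\alpha = t / C_\alpha$ with $t = C_A + C_B$, which is consistent with the correspondence between $4/3 < m < 4$ and $1/4 < n_1/t < 3/4$; the function names are hypothetical.

```python
import math


def multipliers(c_a, c_b):
    """Multipliers inversely proportional to the counts (assumed M = t / C)."""
    t = c_a + c_b
    return t / c_a, t / c_b


def expected_returns(prob_a, c_a, c_b):
    """Expected return of choosing A with probability prob_a:
    (return if A is correct, return if B is correct)."""
    m_a, m_b = multipliers(c_a, c_b)
    return prob_a * m_a, (1.0 - prob_a) * m_b


def worst_case_return(prob_a, c_a, c_b):
    """Guaranteed return when the correct option is unknown."""
    return min(expected_returns(prob_a, c_a, c_b))


if __name__ == "__main__":
    c_a, c_b = 30, 10                      # made-up counts for illustration
    analog = c_a / (c_a + c_b)             # analog herder: probability C_A / t
    print(expected_returns(analog, c_a, c_b))
    print(worst_case_return(0.9, c_a, c_b))
```

With probabilities proportional to the counts, the expected return is the same whichever option turns out to be correct, so no other mixing probability can raise the worst case.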

The system size and number of samples in our experiment are very limited and it is 
difficult to estimate p c precisely. In addition, in our experimental setup, the subjects have 
to choose between A and B. The optimized behavior can be adopted only collectively. An 
interesting problem is whether people can adopt the optimized behavior at the individual 
level or only collectively. In order to clarify this, we need to permit people to divide their 
choice and vote fractionally. If the fraction is proportional to the summary statistic of 
previous subjects' choices, it suggests that people can adopt the optimized behavior at the 
individual level. We think that more extensive experimental studies of this system and of 
related systems deserve further attention [30]. Such experimental studies should provide 
a new approach to econophysics [22, 31-35] and sociophysics [36]. 
Appendix A: Additional information about Experiment 

In EXP-I, 120 subjects were recruited from the Literature Department of Hokkaido 
University. In order to study the effect of social information on the choices of the subjects, 
it was necessary to control the transmission of information from others. We developed a 
web-based voting system by which multiple subjects could simultaneously participate in the 
experiment. The subjects used a web browser to access the web voting server in the intranet. 
Social information about others' choices was shown on the monitor. 

Using slides, we showed subjects how the experiment would proceed. We explained that 
we were studying how their choices were affected by the choices of others. In particular, 
we emphasized that social information was realistic information calculated from the choices 
of previous subjects. Through the slides, we also explained how to calculate multipliers 
$\{M_A, M_B\}$ in case M, with a concrete example. After the explanation, the experiment 
started. The subjects answered the 120 questions in the three cases within about one hour. 
Subjects were paid in cash upon being released from the session. There was a 500 yen (about 
6 dollars) participation fee and additional rewards that were proportional to the number of 
points gained. In cases O and C, one correct choice was worth two points, and one point 
was worth one yen (about 1 cent). In case M, one correct choice was worth the multiplier 
itself. As for EXP-II, detailed information can be obtained from [16]. 



Appendix B: Quiz selection 

We used the same 120 questions in EXP-I and EXP-II. For the selection process, 
please refer to our previous paper [16]. Here, we study whether the difficulty of a question is 
an inherent property or not. For this purpose, we compare the percentage of correct answers 
to each question in case O in Group A and that in Group B. It is defined for Group A as 
$Z(i|O) = \sum_{s=1}^{T_i} X(i,s|O)/T_i$ and for Group B as 
$Z(i+120|O) = \sum_{s=1}^{T_{i+120}} X(i+120,s|O)/T_{i+120}$. 
We show the scatter plot of $\{Z(i|O), Z(i+120|O)\}$ in Figure 9. 

FIG. 9. Scatter plot of $Z(i|O)$ vs $Z(i+120|O)$ in EXP-I. Pearson's correlation coefficient $\rho$ is 
0.8997. 



As the points are distributed almost on the diagonal line, we can infer that 
there is a strong correlation. Pearson's correlation coefficient $\rho$ is about 0.90. In EXP-II, 
we observe the same feature and $\rho$ is about 0.82. The strong correlation means that if a 
question is difficult (easy) for the subjects in one group, it is also difficult (easy) for 
the subjects in the other group. The system sizes in our experiments are very limited and 
there remains some fluctuation in the estimation of $Z(i|O)$, but it should disappear for a large 
system. We can therefore control the difficulty of the questions in the experiment and study the 
response of subjects under controlled conditions. This aspect is important when one makes 
predictions based on the results presented in this paper. 
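
For reference, the correlation coefficient quoted above can be computed as in the following Python sketch. The function name is hypothetical and the two short lists in the usage line are made-up values, not the per-question ratios $Z(i|O)$ and $Z(i+120|O)$ from the experiment.

```python
import math


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    group_a = [0.2, 0.5, 0.7, 0.9]   # made-up correct-answer ratios
    group_b = [0.3, 0.4, 0.8, 0.9]
    print(pearson(group_a, group_b))
```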



Appendix C: Estimation procedure of parameters $\{\lambda_r, a_r\}$ of the LH model 

We describe the estimation procedure of parameters $\{\lambda_r, a_r\}$ of the LH model. We use 
standard maximum likelihood estimation [11]. Given $C_1(i,t|r) = n_1$, the probability that 
$X(i,t+1|r)$ takes 1 is 

$$ \mathrm{Prob}(X(i,t+1|r) = 1 | C_1(i,t|r) = n_1) = (1 - p(i)) \cdot 1 + p(i) \cdot q_h(t, n_1|r). \tag{C1} $$

We write the probability as 

$$ q(1|t, n_1, r) = \mathrm{Prob}(X(i,t+1|r) = 1 | C_1(i,t|r) = n_1). $$

The probability that $X(i,t+1|r)$ takes 0 under condition $C_1(i,t|r) = n_1$ is 

$$ \mathrm{Prob}(X(i,t+1|r) = 0 | C_1(i,t|r) = n_1) = p(i) \cdot (1 - q_h(t, n_1|r)) = 1 - q(1|t, n_1, r), \tag{C2} $$

which we write as 

$$ q(0|t, n_1, r) = \mathrm{Prob}(X(i,t+1|r) = 0 | C_1(i,t|r) = n_1). $$

The likelihood of a particular sequence of choices $\{X(i,t|r)\}_{t=1,\ldots,T_i}$ is simply 

$$ l(\{X(i,t|r)\}_{t=1,\ldots,T_i}) = \prod_{t=1}^{T_i} q(X(i,t|r) | t-1, C_1(i,t-1|r), r). \tag{C3} $$

Finally, assuming independence across sequences, the likelihood of observing a set of 
sequences $\{X(i,t|r)\}_{t=1,\ldots,T_i}$, $i \in I'$ is just 

$$ L(\{X(i,t|r)\}_{t=1,\ldots,T_i}, i \in I') = \prod_{i \in I'} l(\{X(i,t|r)\}_{t=1,\ldots,T_i}). \tag{C4} $$

For comparing models, we adopt AIC [37]. We denote the number of parameters of a model 
by $N_p$. AIC is defined using $\log L$ as 

$$ \mathrm{AIC} = -2 \cdot \log L + 2 \cdot N_p. \tag{C5} $$
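
The likelihood and AIC defined in this appendix can be sketched in Python as follows. This is an illustration, not the authors' estimation code: the function names are hypothetical, the herder proportion $p(i)$ is assumed to be known and positive for each sequence, and the response for the first subject (no prior choices) is assumed to be $q_h = 1/2$.

```python
import math


def q_lh(t, n1, lam, a):
    """Logistic herder response; the ratio is assumed to be 1/2 when t = 0."""
    ratio = n1 / t if t > 0 else 0.5
    return 0.5 * (a * math.tanh(lam * (ratio - 0.5)) + 1.0)


def sequence_log_likelihood(choices, p, lam, a):
    """Logarithm of the likelihood of one sequence of 0/1 choices
    (product over subjects of q(X | t - 1, C_1, r)), with 0 < p <= 1."""
    log_l = 0.0
    n1 = 0
    for t, x in enumerate(choices):   # t prior subjects, n1 of them chose 1
        q1 = (1.0 - p) + p * q_lh(t, n1, lam, a)
        log_l += math.log(q1 if x == 1 else 1.0 - q1)
        n1 += x
    return log_l


def total_log_likelihood(sequences, lam, a):
    """log L over independent sequences; `sequences` holds (choices, p) pairs."""
    return sum(sequence_log_likelihood(c, p, lam, a) for c, p in sequences)


def aic(log_l, n_params):
    """AIC = -2 log L + 2 N_p."""
    return -2.0 * log_l + 2.0 * n_params


if __name__ == "__main__":
    # Consistency check against a reported row (log L = -2951.5, N_p = 2);
    # the result agrees with the tabulated AIC up to rounding of log L.
    print(aic(-2951.5, 2))
```

Maximizing `total_log_likelihood` over `lam` and `a`, for example on a grid or with a numerical optimizer, would give the LH estimates; the optimizer itself is left out of this sketch.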